Abstract: We consider a system of parallel queues where
tasks are assigned (dispatched) to one of the available servers 
upon arrival. The dispatching decision is based on the full state 
information, i.e., on the sizes of the new and existing jobs. We are 
interested in minimizing the so-called mean slowdown criterion 
corresponding to the mean of the sojourn time divided by the 
processing time. Assuming no new jobs arrive, the shortest- 
processing-time-product (SPTP) schedule is known to minimize 
the slowdown of the existing jobs. The main contribution of 
this paper is three-fold: 1) To show the optimality of SPTP 
with respect to slowdown in a single server queue under Poisson 
arrivals; 2) to derive the so-called size-aware value functions for 
M/G/1-FIFO/LIFO/SPTP/SPT/SRPT with general holding costs of 
which the slowdown criterion is a special case; and 3) to utilize 
the value functions to derive efficient dispatching policies so as to 
minimize the mean slowdown in a heterogeneous server system. 
The derived policies offer a significantly better performance than 
e.g., the size-aware-task-assignment with equal load (SITA-E) and 
least-work-left (LWL) policies. 

I. Introduction 

Dispatching problems arise in many contexts such as man- 
ufacturing sites, web server farms, super computing systems, 
and other parallel server systems. In a dispatching system 
jobs are assigned upon arrival to one of the several queues 
as illustrated in Fig. 1. Such systems involve two decisions:
(i) the dispatching policy chooses the server, and (ii) the scheduling
discipline determines the order in which the jobs are served. The
dispatching decisions are irrevocable, i.e., it is not possible to 
move a job to another queue afterwards. In the literature, the 
dispatching problems and their solutions differ with respect to 
(i) optimization objective, (ii) available information, and (iii) 
scheduling discipline used in the servers. 

Although the literature has generally addressed the mean
sojourn time (i.e., response time) as the optimization objective,
other performance metrics can also be relevant. One such
metric is the slowdown¹ of a job, defined as the ratio of the
sojourn time to the processing time (service requirement)
[17], [9], [?]. The slowdown criterion combines efficiency
and fairness and stems from the idea that longer jobs can
tolerate longer sojourn times. The optimal scheduling discipline
in this respect for a single server queue and a fixed number of
jobs is the so-called shortest-processing-time-product² (SPTP)

† This is the full version (including the appendix) of the paper with the
same title that appears in the ACM SIGMETRICS 2012, London, UK.

¹ Yang and de Veciana refer to the slowdown as the bit-transmission delay
(BTD) [30].

² Wierman et al. refer to SPTP as the RS policy, a product of the remaining
size and the original size [28].
Fig. 1. A dispatching system with m = 3 servers.



discipline [30], where the index of a job is the product of the
initial and remaining service requirements and the job with
the smallest index is served first. With the aid of the Gittins index,
we show that SPTP minimizes the mean slowdown in single
server queues also in the dynamic case of Poisson arrivals, i.e.,
it is the optimal discipline for an M/G/1 queue with respect
to the slowdown.

Then we derive value functions with respect to arbitrary
job-specific holding costs for an M/G/1 queue with the first-in-first-out
(FIFO), the last-in-first-out (LIFO), the shortest-processing-time
(SPT), the shortest-remaining-processing-time (SRPT)
and SPTP scheduling disciplines, where the last three are
size-aware. A value function essentially characterizes the queue
state in terms of the expected costs over an infinite time horizon
for a fixed scheduling discipline. In this respect, our work
generalizes the results of [19] to arbitrary job-specific holding
cost rates, and also includes the analysis of the slowdown-specific
SPTP scheduling discipline.

Finally, we apply the derived value functions to the dis- 
patching problem so as to minimize the mean slowdown when 
the scheduling disciplines in each queue are fixed. We assume 
that the dispatcher is fully aware of the state of the system, 
i.e., of the tasks in each queue and their remaining service 
requirements. By starting with an arbitrary state-independent 
policy, we carry out the first-policy-iteration (FPI) step of the 
Markov decision processes (MDP) and obtain efficient state- 
dependent dispatching policies. 

The rest of the paper is organized as follows. In Section II
we consider a single M/G/1 queue with respect to the slowdown
criterion and prove the optimality of SPTP. In Section III we
derive the size-aware value functions with respect to arbitrary
job-specific holding cost rates for FIFO, LIFO and SPTP
(SPT and SRPT are given in the Appendix). The single-queue
scheduling results are utilized in Section IV to derive
efficient dispatching policies, which are then evaluated in
Section V. Section VI concludes the paper.



A. Related Work 

FIFO is perhaps the most common scheduling discipline 
due to its nature and ease of implementation. Other common 
disciplines are LIFO, SPT and SRPT. As for the objective, 
minimization of the mean sojourn time has been a popular 
choice. Indeed, SPT and SRPT, respectively, are the optimal
non-preemptive and preemptive schedules in this respect
[25]. For a recent survey on fairness and other scheduling
objectives in a single server queue we refer to [27].

In the context of dispatching problems, FIFO has been
studied extensively in the literature since the early work by
Winston [29], Ephremides et al. [8], and others. Often the
number of tasks per server is assumed to be known, cf.,
e.g., the join-the-shortest-queue (JSQ) dispatching policy [29].
Even though FIFO queues have received most of the
attention, other scheduling disciplines have also been studied.
For example, Gupta et al. consider JSQ with the processor-sharing
(PS) scheduling discipline in [12].

Only a few optimality results are known for the dispatching
problems. Assuming exponentially distributed interarrival
times and job sizes, [29] shows that JSQ with FIFO minimizes
the mean waiting time when the number in each queue is
available. Also [8] argues for the optimality of JSQ/FIFO
when the number in each queue is available, while Round-Robin
(RR), followed by FIFO, is shown to be the optimal
policy when it is only known that the queues were initially
in the same state. [24] proves that RR/FIFO is optimal in
the absence of queue length information if the job sizes have
a non-decreasing hazard function. The RR results were later
generalized in [23]. Whitt [26], on the other hand, provides
several counterexamples where the JSQ/FIFO policy fails.
Crovella et al. [6] and Harchol-Balter et al. [14] assume that the
dispatcher is aware of the size of a new job, but not of the
state of the FIFO queues, and propose policies based on job
size intervals (e.g., short jobs to one queue, and the rest to
another). Feng et al. [10] later showed that such a policy
is the optimal size-aware state-independent dispatching policy
for homogeneous servers.

The MDP framework also lends itself to dispatching problems.
Krishnan [22] has utilized it in the context of parallel
M/M/s-FIFO servers so as to minimize the mean sojourn
time, similarly to Aalto and Virtamo [?] for the traditional
M/M/1 queue. Recently, FIFO, LIFO, SPT and SRPT queues
were analyzed in [19] with a general service time distribution.
Similarly, PS is considered in [21], [20]. The key idea in
the above work is to start with an arbitrary state-independent
policy, and then carry out the FPI step utilizing the value functions
(relative values of states).

II. M/G/1 Queue and Slowdown

In this section we consider a single M/G/1 queue with arrival
rate $\lambda$. The service requirements are i.i.d. random variables
$X_i \sim X$ with a general distribution.

We define the slowdown of a job as the ratio of the sojourn
time $T$ to the service requirement $X$,

\[ \gamma \triangleq \frac{T}{X}, \]

where one is generally interested in its mean value $\mathrm{E}[\gamma]$. Other
similar definitions also exist. In the context of distributed
computing, Harchol-Balter defines the slowdown as the "wall-time"
divided by the CPU-time in [15]. Another common
convention is to consider the ratio of the waiting time to the
processing time (see, e.g., [13]), which is a well-defined quantity
for non-preemptive systems. These different definitions,
however, are essentially the same.

For non-preemptive work-conserving policies, the sojourn
time in the queue comprises the initial waiting time $W$ and the
subsequent processing time $X$. Thus, the mean slowdown is

\[ \mathrm{E}[\gamma] = \mathrm{E}\!\left[\frac{W + X}{X}\right] = 1 + \mathrm{E}[W/X]. \]

The waiting time with FIFO is independent of the job size $X$, and
with the aid of the Pollaczek-Khinchin formula one obtains [13],

\[ \mathrm{E}[\gamma] = 1 + \mathrm{E}[W]\cdot\mathrm{E}[X^{-1}] = 1 + \frac{\lambda\,\mathrm{E}[X^2]}{2(1-\rho)}\,\mathrm{E}[X^{-1}], \tag{1} \]

which underlines the fact that the mean slowdown may be
infinite for a stable queue, as $\mathrm{E}[X^{-1}] = \infty$ for many common
job size distributions, e.g., for an exponential distribution.

The mean slowdown in an M/G/1 queue with the preemptive
LIFO and PS disciplines is a constant for all job sizes,

\[ \mathrm{E}[\gamma \mid X = x] = \frac{1}{1-\rho}. \tag{2} \]

Comparison of (1) and (2) immediately gives:

Corollary 1: FIFO is a better scheduling discipline than
LIFO in an M/G/1 queue with respect to slowdown iff

\[ 2\,\mathrm{E}[X] > \mathrm{E}[X^2]\cdot\mathrm{E}[X^{-1}]. \tag{3} \]
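Equations (1)-(3) are easy to evaluate numerically once the three moments of the job size are known. The following is a minimal sketch, not part of the original analysis; the function names and the uniform job size example are illustrative assumptions.

```python
import math


def fifo_mean_slowdown(lam, mean_x, mean_x2, mean_inv_x):
    """Mean slowdown of an M/G/1-FIFO queue according to eq. (1)."""
    rho = lam * mean_x
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue must be stable (rho < 1)")
    mean_wait = lam * mean_x2 / (2.0 * (1.0 - rho))  # Pollaczek-Khinchin
    return 1.0 + mean_wait * mean_inv_x


def lifo_mean_slowdown(lam, mean_x):
    """Mean slowdown of a preemptive M/G/1-LIFO (or PS) queue, eq. (2)."""
    rho = lam * mean_x
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue must be stable (rho < 1)")
    return 1.0 / (1.0 - rho)


def fifo_beats_lifo(mean_x, mean_x2, mean_inv_x):
    """Condition (3) of Corollary 1; it does not depend on the arrival rate."""
    return 2.0 * mean_x > mean_x2 * mean_inv_x


def uniform_moments(a, b):
    """E[X], E[X^2] and E[1/X] for X ~ U(a, b) with 0 < a < b (example only)."""
    return (a + b) / 2.0, (a * a + a * b + b * b) / 3.0, math.log(b / a) / (b - a)


moments = uniform_moments(1.0, 3.0)
print(fifo_beats_lifo(*moments))
print(fifo_mean_slowdown(0.25, *moments), lifo_mean_slowdown(0.25, moments[0]))
```

For job sizes with a large spread, e.g., a two-point distribution on {0.1, 10}, the product of the second moment and the inverse moment dominates and condition (3) fails, so LIFO gives the lower mean slowdown.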

A. Shortest-Processing-Time-Product (SPTP) 

In [30], Yang and de Veciana introduced the
shortest-processing-time-product (SPTP) scheduling discipline, which
serves the job $i^*$ such that

\[ i^* = \arg\min_i \Delta_i^* \Delta_i, \]

where $\Delta_i^*$ and $\Delta_i$ denote, respectively, the initial and
remaining service requirement of job $i$. Yang and de Veciana
showed that SPTP is optimal with respect to the
mean slowdown $\mathrm{E}[\gamma]$ in a transient system where all jobs are
available at time 0 and no new jobs arrive thereafter (i.e., a
myopic approach). SPTP can be seen as the counterpart of
SRPT³ when, instead of minimizing the mean sojourn time,
one is interested in minimizing the mean slowdown.
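The SPTP selection rule amounts to a single minimization over the jobs present. The sketch below illustrates it next to SPT and SRPT; the representation of a job as an (initial size, remaining size) pair and the function names are assumptions made for this example.

```python
from typing import List, Tuple

Job = Tuple[float, float]  # (initial size, remaining size); hypothetical layout


def sptp_next_job(jobs: List[Job]) -> int:
    """Index of the job SPTP serves: smallest initial * remaining product."""
    return min(range(len(jobs)), key=lambda i: jobs[i][0] * jobs[i][1])


def srpt_next_job(jobs: List[Job]) -> int:
    """Index of the job SRPT serves: smallest remaining size."""
    return min(range(len(jobs)), key=lambda i: jobs[i][1])


def spt_next_job(jobs: List[Job]) -> int:
    """Index of the job SPT serves: smallest initial size."""
    return min(range(len(jobs)), key=lambda i: jobs[i][0])


queue = [(10.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
print(sptp_next_job(queue), srpt_next_job(queue), spt_next_job(queue))
```

In the example queue the nearly finished long job is preferred by SRPT, whereas SPTP still favours the short job because its index product is smaller. The functions raise `ValueError` for an empty queue.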

We argue below that SPTP is the optimal scheduling discipline
also in the corresponding dynamic system with Poisson
arrivals. That is, we show that SPTP is optimal with respect
to the slowdown in the M/G/1 queue. Our proof is based on
the Gittins index approach.



³ Note that defining an index policy with indices $\Delta_i$, $\Delta_i^*$ and $\sqrt{\Delta_i \Delta_i^*}$
gives SRPT, SPT and SPTP, respectively, where $\sqrt{\Delta_i \Delta_i^*}$ corresponds to the
geometric mean of $\Delta_i$ and $\Delta_i^*$.



B. Gittins Index 

Consider an M/G/1 queue with multiple job classes $k$, $k = 1, \ldots, K$.
Let $\lambda_k$ denote the class-$k$ arrival rate, and $p_k = \lambda_k/\lambda$
the probability that an arriving job belongs to class $k$. In
addition, let $X_k$ denote the generic service time related to class
$k$. The total load is denoted by $\rho = \lambda_1 \mathrm{E}[X_1] + \ldots + \lambda_K \mathrm{E}[X_K]$.
We assume that $\rho < 1$. The Gittins index [11], [?] for a class-$k$
job with attained service $a$ is defined by

\[ G_k(a) \triangleq \sup_{\delta > 0} w_k\,\frac{\mathrm{P}\{X_k - a \le \delta \mid X_k > a\}}{\mathrm{E}[\min\{X_k - a, \delta\} \mid X_k > a]}, \]

where $w_k$ is the holding cost rate related to class $k$. Note that,
in addition to the attained service $a$, the Gittins index is based
on the (class-specific) distribution of the service time, but not
on the service time itself. The Gittins index policy will serve
the job $i^*$ such that

\[ i^* = \arg\max_i G_{k_i}(a_i), \]

where $k_i$ refers to the class and $a_i$ to the attained service of
job $i$. Let $T_k$ denote the sojourn time of a class-$k$ job. We
recall the following optimality result proved by Gittins [11,
Theorem 3.28].

Theorem 1: Consider an M/G/1 multi-class queue. The
Gittins index policy minimizes the mean holding costs,

\[ \sum_k p_k w_k \mathrm{E}[T_k], \]

among the non-anticipating scheduling policies.

The non-anticipating policies are only aware of the attained
service times $a_i$ of each job $i$ and the service time distributions
(but not of the actual service times). So, in general,
non-anticipating policies do not know the remaining service times,
implying that, e.g., SPTP does not belong to the non-anticipating
disciplines. However, there is one exception: If the service
times $X_k$ are deterministic, $\mathrm{P}\{X_k = x_k\} = 1$, then the
remaining service times $x_k - a_k$ are available to the
non-anticipating policies (with probability 1). In such a case, all
policies are non-anticipating.

C. Optimality of SPTP 

Consider now an M/G/1 queue with a single class and load
$\rho < 1$. Let $X$ and $T$ denote, respectively, the generic service
and sojourn times of a job. In addition, let $t(x)$ denote the
conditional mean sojourn time of a job with service time $x$,

\[ t(x) = \mathrm{E}[T \mid X = x]. \]

Assume first that the support of the service time distribution
is finite. In other words, there are $x_1 < \ldots < x_K$ such that
$\sum_k \mathrm{P}\{X = x_k\} = 1$. The following result is an immediate
consequence of Theorem 1 as soon as we associate class $k$ with
the jobs with the same service time requirement $x_k$. Note that
the Gittins index is now given by

\[ G_k(a) = \frac{w_k}{x_k - a}, \]

with the optimal $\delta$ equal to $x_k - a$.
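The closed form of the index in the deterministic case follows directly from the definition of $G_k(a)$. The short derivation below is a sketch of that step, assuming the probability in the definition is taken with a non-strict inequality; it uses only symbols already introduced.

```latex
% Deterministic class-k service time: P{X_k = x_k} = 1, attained service a < x_k.
\begin{align*}
\mathrm{P}\{X_k - a \le \delta \mid X_k > a\} &= \mathbf{1}\{\delta \ge x_k - a\}, \\
\mathrm{E}[\min\{X_k - a, \delta\} \mid X_k > a] &= \min\{x_k - a, \delta\}, \\
G_k(a) &= \sup_{\delta > 0} w_k\,
          \frac{\mathbf{1}\{\delta \ge x_k - a\}}{\min\{x_k - a, \delta\}}
        = \frac{w_k}{x_k - a}.
\end{align*}
```

The ratio is zero for $\delta < x_k - a$ and equals $w_k/(x_k - a)$ for every $\delta \ge x_k - a$, so the supremum is attained at $\delta = x_k - a$.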



Corollary 2: Consider an M/G/1 queue for which the
support of the service time distribution is finite. Among all
scheduling policies, the mean holding costs,

\[ \sum_k \mathrm{P}\{X = x_k\}\, w_k\, t(x_k), \]

are minimized by the policy that will serve job $i^*$ such that

\[ i^* = \arg\min_i \frac{\Delta_i}{w_{k(i)}}, \]

where $k(i)$ and $\Delta_i$ denote the class and the remaining service
requirement of job $i$.

In particular, if we choose $w_k = 1/x_k$, we see that the mean
slowdown is minimized by SPTP:

Corollary 3: Consider an M/G/1 queue for which the support
of the service time distribution is finite. SPTP minimizes
the mean slowdown $\mathrm{E}[\gamma]$ among all scheduling policies.

Since any service time distribution can be approximated 
with an arbitrary precision by discrete service time distribu- 
tions, we expect that the above result holds also for general 
service time distributions. Note also that the choice w k = 1, 
for all k, results in the well-known optimality result of SRPT 
with respect to the mean sojourn time. 

III. Value Functions for M/G/1

In this section we analyze a single M/G/1 queue in isolation
with the FIFO, LIFO, and SPTP scheduling disciplines and derive
expressions for the size-aware value functions with respect to
arbitrary job-specific holding costs. The corresponding results
for SPT and SRPT are given in the Appendix. As special
cases, one obtains the value functions with respect to
the mean sojourn time and the mean slowdown. Thus, these
results generalize the corresponding results in [19], where the
size-aware value functions with respect to sojourn time are
given (except for SPTP).

Holding cost model: Consider an M/G/1 queue with an
arbitrary but fixed work-conserving scheduling discipline and
load $\rho < 1$. Each existing task incurs costs at a task-specific
holding cost rate, denoted by $B_i$ for job $i$. Assume
that the $B_i$ are i.i.d. random variables, $B_i \sim B$. However, $B_i$ may
depend on the corresponding service requirement $X_i$. Both the
service requirement and the holding cost become known upon
arrival.

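The cumulative cost during $(0, t)$ for an initial state $z$ is

\[ V_z(t) \triangleq \int_0^t \sum_{i \in \mathcal{N}_z(s)} B_i \, ds, \]

where $\mathcal{N}_z(s)$ denotes the set of tasks present in the system at
time $s$. Similarly, the long-term mean cost rate is

\[ r_z \triangleq \lim_{t\to\infty} \frac{1}{t}\,\mathrm{E}[V_z(t)] = \lim_{t\to\infty} \frac{1}{t}\,\mathrm{E}\!\left[\int_0^t \sum_{i \in \mathcal{N}_z(s)} B_i \, ds\right], \]

which for an ergodic Markovian system reads

\[ r = \lim_{t\to\infty} \mathrm{E}\!\left[\sum_{i \in \mathcal{N}_z(t)} B_i\right]. \]

The value function $v_z$ is defined as the expected deviation
from the mean cost rate over an infinite time horizon,

\[ v_z \triangleq \lim_{t\to\infty} \mathrm{E}[V_z(t) - r\,t]. \]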

t— foo 

A value function characterizes how expensive it is to start
from each state. In our setting, it enables one to compute the
expected cost of admitting a job with size $x$ and holding cost
rate $b$ to a queue, which afterwards behaves as an M/G/1 queue
with the given scheduling discipline:

\[ \omega_z(x, b) \triangleq v_{z \oplus (x,b)} - v_z, \tag{4} \]

where $z \oplus (x, b)$ denotes the state resulting from adding a
new job $(x, b)$ in state $z$. In Section IV we will carry out the
FPI step in the context of the dispatching problem, for which a
corresponding value function is a prerequisite.

Size-dependent holding cost: In general, the holding cost
rate $b$ can be arbitrary, e.g., a class-specific i.i.d. random variable.
However, in two important special cases, mean sojourn
time and mean slowdown, it depends solely on the service
requirement $x$, $b = c(x)$:

mean sojourn time: $c(x) = 1$
mean slowdown: $c(x) = 1/x$

In equilibrium, the average cost per job is $\mathrm{E}[c(X) \cdot T]$. Using
Little's result, we have for the mean cost rate,

\[ r = \lambda\,\mathrm{E}[c(X) \cdot T] = \begin{cases} \lambda\,\mathrm{E}[T] = \mathrm{E}[N], & \text{for } c(x) = 1, \\ \lambda\,\mathrm{E}[\gamma], & \text{for } c(x) = 1/x, \end{cases} \]

where $\mathrm{E}[N]$ denotes the mean number in the system.

A. M/G/1-FIFO Queue

Next we derive the size-aware value function for an M/G/1-FIFO
queue with respect to arbitrary job-specific holding
costs. To this end, let $z = ((\Delta_1, b_1); \ldots; (\Delta_n, b_n))$ denote the
remaining service requirements (measured in time) $\Delta_i$ and the
corresponding holding cost rates $b_i$ at state $z$. The total rate
at which costs are accrued in a given state $z$ is thus the sum
$\sum_i b_i$. Job 1 is currently receiving service and job $n$ is the
latest arrival. The total backlog is denoted by $u_z$,

\[ u_z = \sum_{i=1}^{n} \Delta_i. \]

Proposition 1: For the size-aware relative value in an
M/G/1-FIFO queue with respect to arbitrary job-specific
holding costs it holds that

\[ v_z - v_0 = \sum_{i=1}^{n} b_i \sum_{j=1}^{i} \Delta_j + \frac{\lambda\,\mathrm{E}[B]}{2(1-\rho)}\,u_z^2, \tag{5} \]

where $\mathrm{E}[B]$ is the mean holding cost rate for all later jobs.

Proof: We compare two systems with the same arrival
patterns, System 1 initially in state $z$ and System 2 initially
empty. The two systems behave equivalently after System
1 becomes empty. The cost incurred by the current $n$ jobs,
present only in System 1, is already fixed as

\[ h_1 = \sum_{i=1}^{n} b_i \left( \sum_{j=1}^{i} \Delta_j \right). \]

Fig. 2. Derivation of the value function in an M/G/1-FIFO queue. (The same arrival pattern is applied to System 1, initially in state z, and System 2, initially empty; later arrivals trigger (mini) busy periods.)

The later arriving jobs encounter a longer waiting time in System 1.
The key observation here is that these jobs experience
an additional delay of $Y$ in System 1 when compared to System 2,
as is illustrated in Fig. 2. Otherwise the sojourn times
are equal. On average $\lambda u_z$ (mini) busy periods occur before
System 1 becomes empty. These busy periods are independent
and on average $1/(1-\rho)$ jobs are served during each of
them. The average additional waiting time is $\mathrm{E}[Y] = u_z/2$.
Therefore, the later arriving jobs incur on average

\[ h_2 = \frac{\lambda\,\mathrm{E}[B]}{2(1-\rho)}\,u_z^2 \]

higher holding cost in System 1 than in System 2. The total cost
difference is $h_1 + h_2$, which completes the proof. ■

Corollary 4: The mean cost in terms of sojourn time due
to accepting a job with size $x$ to a size-aware M/G/1-FIFO
queue initially at state $z$ is

\[ \omega_z(x) = x + u_z + \frac{\lambda}{2(1-\rho)}\,(2 u_z x + x^2). \]

Corollary 5: The mean cost in terms of slowdown due to
accepting a job with size $x$ to a size-aware M/G/1-FIFO queue
initially at state $z$ is

\[ \omega_z(x) = 1 + \frac{u_z}{x} + \frac{\lambda\,\mathrm{E}[X^{-1}]}{2(1-\rho)}\,(2 u_z x + x^2). \tag{6} \]

In both cases, it is implicitly assumed that the question about
the admittance is a one-time operation and the future behavior
of the queue is according to the standard M/G/1-FIFO. The
proofs follow from (4) and (5).
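The relation between the relative value (5) and the two admission costs can be checked with a few lines of code. This is an illustrative sketch only; the function names and the list-of-pairs state layout are assumptions, and the arrival rate and load are passed in separately.

```python
def fifo_relative_value(jobs, lam, rho, mean_b):
    """v_z - v_0 from (5); jobs = [(delta_1, b_1), ..., (delta_n, b_n)],
    where job 1 is in service and job n is the latest arrival."""
    backlog = 0.0
    present = 0.0
    for delta, b in jobs:
        backlog += delta          # work ahead of (and including) this job
        present += b * backlog    # holding cost until this job departs
    return present + lam * mean_b * backlog ** 2 / (2.0 * (1.0 - rho))


def fifo_cost_sojourn(x, backlog, lam, rho):
    """Admission cost of Corollary 4 (holding cost rate 1 for every job)."""
    return x + backlog + lam / (2.0 * (1.0 - rho)) * (2.0 * backlog * x + x * x)


def fifo_cost_slowdown(x, backlog, lam, rho, mean_inv_x):
    """Admission cost (6) of Corollary 5 (holding cost rate 1/size)."""
    later = lam * mean_inv_x / (2.0 * (1.0 - rho)) * (2.0 * backlog * x + x * x)
    return 1.0 + backlog / x + later


# Consistency check of definition (4): cost = v_{z + job} - v_z.
before = fifo_relative_value([(2.0, 1.0)], 0.5, 0.5, 1.0)
after = fifo_relative_value([(2.0, 1.0), (1.0, 1.0)], 0.5, 0.5, 1.0)
print(after - before, fifo_cost_sojourn(1.0, 2.0, 0.5, 0.5))
```

Both printed numbers agree, which is the content of Corollary 4 for this particular state.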

B. M/G/1-LIFO Queue 

Next we derive the size-aware value function for the LIFO
scheduling discipline. To this end, let us denote the state of the
system with $n$ jobs by a vector $z = ((\Delta_1, b_1); \ldots; (\Delta_n, b_n))$,
where $\Delta_i$ denotes the remaining service requirement (measured
in time) and $b_i$ the holding cost rate of job $i$. Job 1
is the latest arrival and currently receiving service (if any),
i.e., without new arrivals the jobs are processed in the natural
order: $1, 2, \ldots, n$.

Proposition 2: For the size-aware relative value in a preemptive
M/G/1-LIFO queue with respect to arbitrary job-specific
holding costs it holds that

\[ v_z - v_0 = \frac{1}{1-\rho} \sum_{i=1}^{n} b_i \sum_{j=1}^{i} \Delta_j. \tag{7} \]

Proof: We compare System 1 initially in state $z$ and
System 2 initially empty. The current state bears no meaning
for the future arrivals with preemptive LIFO, and thus the
difference in the expected costs is equal to the cost the current
$n$ jobs in System 1 incur:

\[ v_z - v_0 = \sum_{i=1}^{n} b_i\,\mathrm{E}[R_i], \]

where $R_i$ denotes the remaining sojourn time of job $i$ [19],

\[ \mathrm{E}[R_i] = \frac{\sum_{j=1}^{i} \Delta_j}{1-\rho}, \]

which completes the proof. ■

For the mean sojourn time, the holding cost rate is constant,
$b_i = 1$, and a sufficient state description is $(\Delta_1, \ldots, \Delta_n)$.

Corollary 6: The cost in terms of sojourn time due to
accepting a job with size $x$ to a size-aware preemptive
M/G/1-LIFO queue at state $z$ is

\[ \omega_z(x) = \frac{n+1}{1-\rho}\,x. \]

For the slowdown, the initial service requirements define
the holding costs and a sufficient state description is
$z = ((\Delta_1, \Delta_1^*), \ldots, (\Delta_n, \Delta_n^*))$, where $\Delta_i$ and $\Delta_i^*$ denote the
remaining and initial service requirement of job $i$, so that the holding
cost of job $i$ is $b_i = 1/\Delta_i^*$:

Corollary 7: The cost in terms of slowdown due to accepting
a job with size $x$ to a size-aware preemptive M/G/1-LIFO
queue at state $z$ is

\[ \omega_z(x) = \frac{1}{1-\rho} \left( 1 + \sum_{i=1}^{n} \frac{x}{\Delta_i^*} \right). \tag{8} \]

The proofs follow from (7) and from definition (4).
Note that $\omega_z(x)$ with (preemptive) LIFO does not depend on
the remaining service times $\Delta_i$ due to the preemption.
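Both LIFO admission costs depend on the state only through the number of jobs or their initial sizes, which makes them cheap to evaluate. The following sketch is illustrative only; the function names and argument layout are assumptions.

```python
def lifo_cost_sojourn(x, n, rho):
    """Corollary 6: sojourn-time cost of admitting a size-x job to a
    preemptive M/G/1-LIFO queue that currently holds n jobs."""
    return (n + 1) * x / (1.0 - rho)


def lifo_cost_slowdown(x, initial_sizes, rho):
    """Corollary 7, eq. (8): slowdown cost of admitting a size-x job;
    initial_sizes lists the original sizes of the jobs already present."""
    delay_cost = sum(x / s for s in initial_sizes)  # x delays every old job
    return (1.0 + delay_cost) / (1.0 - rho)


print(lifo_cost_sojourn(2.0, 3, 0.5))
print(lifo_cost_slowdown(2.0, [1.0, 4.0], 0.5))
```

The new job preempts everything else, so each existing job is delayed by the new job's full busy period, which is why the remaining sizes do not appear.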

C. M/G/1-SPTP Queue 

Next we derive the corresponding size-aware value function
for the SPTP scheduling discipline. The state description
for SPTP is $z = ((\Delta_1, \Delta_1^*, b_1), \ldots, (\Delta_n, \Delta_n^*, b_n))$, where,
without loss of generality, we assume a decreasing priority
order, $\Delta_i \Delta_i^* < \Delta_{i+1} \Delta_{i+1}^*$, i.e., job 1 (if any) is currently
receiving service. Then, $u_z(h)$ denotes the amount of work
with a higher priority than $h$,

\[ u_z(h) = \sum_{i:\,\Delta_i \Delta_i^* < h} \Delta_i. \]



Fig. 3. Remaining sojourn time of a $(\Delta, \Delta^*)$-job in an M/G/1-SPTP queue. (Arriving higher priority jobs interrupt the service of the present jobs; the waiting phase corresponds to the initial higher priority work $u_z(\Delta\Delta^*)$.)



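Let $\lambda(x)$, $m(x)$ and $\rho(x)$ denote the arrival rate, mean job
size and offered load due to jobs shorter than $x$,

\[ \lambda(x) = \lambda\,\mathrm{P}\{X < x\}, \qquad m(x) = \mathrm{E}[X \mid X < x], \qquad \rho(x) = \lambda \int_0^x t f(t)\,dt, \tag{9} \]

where $f(x)$ denotes the job size pdf.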

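Lemma 1: The mean remaining sojourn time of a $(\Delta, \Delta^*)$-job
in an M/G/1-SPTP queue initially in state $z$ is given by

\[ \mathrm{E}[R_z(\Delta, \Delta^*)] = \frac{u_z(\Delta\Delta^*)}{1 - \rho(\sqrt{\Delta\Delta^*})} + \int_0^{\sqrt{\Delta\Delta^*}} \frac{2x}{\Delta^* (1 - \rho(x))}\,dx. \tag{10} \]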



Proof: Let $k$ denote the $(\Delta, \Delta^*)$-job, whose remaining
sojourn time depends on the initial and later arriving higher
priority work. We can assume that the latter are served
immediately according to LIFO, thus triggering mini busy
periods. Let $h = h(t)$ denote the remaining service time of
job $k$ at (virtual) time $t$, where we have omitted these mini
busy periods from the time axis. During $0 < t < u_z(\Delta\Delta^*)$,
job $k$ is waiting and $h = \Delta$, but as the service begins $h \to 0$
linearly. The later arriving higher priority jobs shorter than
$\sqrt{h\Delta^*}$ constitute an inhomogeneous Poisson process with rate
$\lambda(\sqrt{h\Delta^*})$, as illustrated in Fig. 3. The mean duration of a mini
busy period is (cf. the mean busy period in M/G/1),

\[ D(h) = \frac{m(\sqrt{h\Delta^*})}{1 - \rho(\sqrt{h\Delta^*})}. \]

The mean waiting time before job $k$ receives service for the
first time is $u_z(\Delta\Delta^*) + \lambda(\sqrt{\Delta\Delta^*})\,u_z(\Delta\Delta^*) \cdot D(\Delta)$, which
gives the first term in (10), since $\lambda(x)\,m(x) = \rho(x)$.

The service time $\Delta$ and the additional delays during the
service are on average

\[ \Delta + \int_0^{\Delta} \lambda(\sqrt{h\Delta^*}) \cdot D(h)\,dh; \]

refer to the "in service" region of Fig. 3. The change of integration
variable $x = \sqrt{h\Delta^*}$ then gives the second term in (10), which
completes the proof. ■
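Equation (10) can be evaluated numerically for any load function. The sketch below uses a plain midpoint rule; the function name, the argument `rho` (a callable returning the load due to jobs shorter than its argument) and the step count are assumptions made for the example.

```python
import math


def sptp_remaining_sojourn(delta, delta_init, higher_work, rho, steps=2000):
    """Mean remaining sojourn time (10) of a job with remaining size `delta`
    and initial size `delta_init`, given `higher_work` = u_z(delta*delta_init)
    and a callable rho(x) for the load of jobs shorter than x."""
    if delta <= 0.0 or delta_init < delta:
        raise ValueError("need 0 < delta <= delta_init")
    top = math.sqrt(delta * delta_init)
    waiting = higher_work / (1.0 - rho(top))        # first term of (10)
    h = top / steps
    in_service = 0.0
    for k in range(steps):                          # midpoint rule
        x = (k + 0.5) * h
        in_service += 2.0 * x / (delta_init * (1.0 - rho(x))) * h
    return waiting + in_service


# Without later arrivals the job simply waits for the work ahead of it
# and then receives its own remaining service.
print(sptp_remaining_sojourn(1.5, 2.0, 3.0, lambda x: 0.0))
```

With a zero load function the result reduces to the higher priority work plus the job's own remaining size, which is a convenient sanity check of the change of variable used in the proof.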

Even though SPTP explicitly tries to minimize the mean 
slowdown, we derive next a general expression for the value 
function with arbitrary job specific holding costs: 

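Proposition 3: For the size-aware relative value in an
M/G/1-SPTP queue with respect to arbitrary job-specific
holding costs it holds that

\[
\begin{aligned}
v_z - v_0 ={}& \sum_{i=1}^{n} b_i \left( \frac{\sum_{j=1}^{i-1} \Delta_j}{1 - \rho(\tilde\Delta_i)} + \frac{2}{\Delta_i^*} \int_0^{\tilde\Delta_i} \frac{x}{1 - \rho(x)}\,dx \right) \\
&+ \frac{\lambda}{2} \sum_{i=0}^{n} \left[ \Big( \sum_{j=1}^{i} \Delta_j \Big)^{2} \int_{\tilde\Delta_i}^{\tilde\Delta_{i+1}} \frac{b(x) f(x)}{(1 - \rho(x))^2}\,dx + \sum_{j=i+1}^{n} \frac{1}{(\Delta_j^*)^2} \int_{\tilde\Delta_i}^{\tilde\Delta_{i+1}} \frac{x^4\, b(x) f(x)}{(1 - \rho(x))^2}\,dx \right],
\end{aligned} \tag{11}
\]

where $b(x) = \mathrm{E}[B \mid X = x]$ and

\[ \tilde\Delta_i = \begin{cases} 0, & i = 0, \\ \sqrt{\Delta_i \Delta_i^*}, & i = 1, \ldots, n, \\ \infty, & i = n + 1. \end{cases} \]

Fig. 4. Derivation of the value function for an M/G/1-SPTP queue with respect to slowdown. (The same arrival pattern is applied to System 1, initially in state z, and System 2, initially empty. Figure annotation: a job with an initially higher priority index than $x^2$ becomes a higher priority job as the remaining service requirement decreases.)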

Proof: The relative value comprises the mean holding cost
$h_1$ accrued by the current $n$ tasks in state $z$,

\[ z = ((\Delta_1, \Delta_1^*, b_1), \ldots, (\Delta_n, \Delta_n^*, b_n)), \]

and the difference in costs accrued by the later arriving jobs,
$h_2$, so that $v_z - v_0 = h_1 + h_2$. The first summation over $n$ gives
$h_1$, i.e., it follows from multiplying (10) with the job-specific
holding cost $b_i$ and adding over all $i$.

The latter integral corresponds to $h_2$, where the accrued
costs are conditioned on the size $x$ of an arriving job. The
corresponding arrival rate is $\lambda f(x)\,dx$. The mean difference
in sojourn time experienced by an arriving job with size $x$ is
accrued during the initial waiting time: once a job enters the
service for the first time, its mean remaining sojourn time is
the same in both systems. The initial waiting time is a function
of the higher priority workload upon arrival, denoted by $U_z(x^2, t)$
for an initial state $z$ at arrival time $t$. In particular, the mean
(initial) waiting time is simply $\mathrm{E}[U_z(x^2, t)]/(1 - \rho(x))$, which
gives

\[ h_2 = \lambda \int_0^\infty \frac{b(x) f(x)}{1 - \rho(x)} \left( \int_0^\infty \mathrm{E}[U_z(x^2, t) - U_0(x^2, t)]\,dt \right) dx. \]

Next we refer to Fig. 4 and observe that $U_z(x^2, t) - U_0(x^2, t)$
in the integrand corresponds to area $A$. This area consists of
one $u_z(x^2)$ triangle, followed by $N_0$ rectangles, and similar
sequences starting with an $x^2/\Delta_i^*$ triangle for $\{i : \Delta_i \Delta_i^* > x^2\}$,
each followed by $N_i$ rectangles. The $N_i$ are random variables
corresponding to the number of mini busy periods during
the service time of a particular triangle. The first triangle
corresponds to the initially higher priority workload $u_z(x^2)$,
and the latter to the jobs with initially lower priority, i.e., to
jobs $i$ with $\Delta_i \Delta_i^* > x^2$. At some point in time, the remaining
service requirement of such a job $i$ decreases to $y_i = x^2/\Delta_i^*$
and its priority index drops below $x^2$. For a sequence starting
with a $y$-triangle, the number of mini busy periods (rectangles)
obeys a Poisson distribution with mean $\lambda(x)\,y$. The height of
a rectangle is uniformly distributed in $(0, y)$ having the mean
$y/2$ (property of the Poisson process), while the width corresponds
to the duration of a busy period in a work-conserving M/G/1
queue having the mean $m(x)/(1 - \rho(x))$. Thus, the mean total
area is

\[ \frac{y^2}{2} + \lambda(x)\,y \cdot \frac{y}{2} \cdot \frac{m(x)}{1 - \rho(x)} = \frac{y^2}{2(1 - \rho(x))}. \]

Therefore,

\[ h_2 = \frac{\lambda}{2} \int_0^\infty \frac{b(x) f(x)}{(1 - \rho(x))^2} \left[ u_z(x^2)^2 + \sum_{i:\,\Delta_i \Delta_i^* > x^2} \left( \frac{x^2}{\Delta_i^*} \right)^{2} \right] dx, \]

where the factor $b(x) = \mathrm{E}[B \mid X = x]$ in the numerator
corresponds to the mean holding cost of an $x$-job. For each
interval $x \in (\tilde\Delta_i, \tilde\Delta_{i+1})$, $u_z(x^2) = \sum_{j=1}^{i} \Delta_j$ and

\[ \sum_{j:\,\Delta_j \Delta_j^* > x^2} \left( \frac{x^2}{\Delta_j^*} \right)^{2} = x^4 \sum_{j=i+1}^{n} \frac{1}{(\Delta_j^*)^2}, \]

and integration in $n + 1$ parts completes the proof. ■

Recall that with respect to the mean sojourn time, $b_i = 1$
and $b(x) = 1$, while for the slowdown criterion, $b_i = 1/\Delta_i^*$
and $b(x) = 1/x$. Hence, (11) allows one to compute the mean
cost $\omega_z(x)$ due to accepting a given job in terms of sojourn
time or slowdown. We omit the explicit expression for $\omega_z(x)$
for brevity.

The corresponding value functions for SPT and SRPT are
derived in the Appendix. The numerical evaluation of the value
function for SPTP, SPT and SRPT is not as unattractive as it
may first seem. Basically, one needs to be able to compute
integrals of the form

\[ F_k(x) = \int_0^x \frac{t^k\, b(t) f(t)}{(1 - \rho(t))^2}\,dt, \quad \text{and} \quad G_d(x) = \int_0^x \frac{t^d}{1 - \rho(t)}\,dt, \]

where $k = 0, 2, 4$ and $d = 0, 1$ depending on the discipline.
Thus, e.g., a suitable interpolation of the $F_k(x)$ and $G_d(x)$
enables on-line computation of the value function.
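One way to realize the interpolation idea is to tabulate the integrals once on a grid and answer later queries by linear interpolation. The sketch below assumes integrands of the form given above; all names are hypothetical, and the uniform job size distribution on (0, 2) with unit holding cost is only an example.

```python
import bisect


def tabulate_integral(integrand, x_max, n):
    """Cumulative trapezoidal integral of `integrand` on [0, x_max], n intervals."""
    h = x_max / n
    xs = [i * h for i in range(n + 1)]
    ys = [integrand(x) for x in xs]
    cum = [0.0]
    for i in range(n):
        cum.append(cum[-1] + 0.5 * h * (ys[i] + ys[i + 1]))
    return xs, cum


def lookup(table, x):
    """Linear interpolation in a table produced by tabulate_integral."""
    xs, cum = table
    if x <= xs[0]:
        return cum[0]
    if x >= xs[-1]:
        return cum[-1]
    i = bisect.bisect_right(xs, x) - 1
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return cum[i] + w * (cum[i + 1] - cum[i])


def f_integrand(k, b, f, rho):
    """Integrand of F_k: t^k b(t) f(t) / (1 - rho(t))^2."""
    return lambda t: t ** k * b(t) * f(t) / (1.0 - rho(t)) ** 2


def g_integrand(d, rho):
    """Integrand of G_d: t^d / (1 - rho(t))."""
    return lambda t: t ** d / (1.0 - rho(t))


# Example (assumption): X ~ U(0, 2), arrival rate 0.5, holding cost rate 1.
LAM = 0.5
pdf = lambda t: 0.5                          # density on the support [0, 2]
rho = lambda t: LAM * min(t, 2.0) ** 2 / 4.0  # lam * int_0^t s f(s) ds
unit_cost = lambda t: 1.0
F2 = tabulate_integral(f_integrand(2, unit_cost, pdf, rho), 2.0, 400)
G1 = tabulate_integral(g_integrand(1, rho), 2.0, 400)
print(lookup(F2, 1.0), lookup(G1, 1.0))
```

The tables never leave the support of the job size distribution, so the constant density is sufficient here. For the slowdown criterion the cost $b(t) = 1/t$ is singular at zero, so a distribution bounded away from zero or a more careful quadrature would be needed.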

IV. Dispatching Problem 

Next we utilize the results of Section III in the dispatching
problem with parallel servers and focus solely on minimizing
the mean slowdown. The dispatching system illustrated in
Fig. 1 comprises $m$ servers with service rates $\nu_1, \ldots, \nu_m$.
Jobs arrive according to a Poisson process with rate $\lambda$ and
their service requirements are i.i.d. random variables with a
general distribution. The jobs are served according to a given
scheduling discipline (e.g., FIFO) in each server.

The slowdown for an isolated queue was defined as the
sojourn time $T$ divided by the service requirement $X$ [17]
(both measured in time), $\gamma = T/X$. However, as we consider
heterogeneous servers with rates $\nu_i$, the service requirement
$X$ of a size-$Y$ job (measured, e.g., in bytes) is no longer
unambiguous but a server-specific quantity,

\[ X_i = Y/\nu_i. \]

Therefore, for a dispatching system we compare the sojourn
time $T$ to the hypothetical service time if all capacity could
be assigned to process a given job [7],

\[ \gamma^* \triangleq \frac{T}{Y / \sum_i \nu_i}. \]

The relationship between the queue-$i$-specific slowdown $\gamma$ and
the system-wide slowdown $\gamma^*$ is $\gamma^* = \gamma \cdot (\sum_j \nu_j)/\nu_i$.

A. Random Dispatching Policies 

The so-called Bernoulli splitting, which assigns jobs independently
at random using a probability distribution $(p_1, \ldots, p_m)$,
offers a good state-independent basic dispatching policy. Due
to Poisson arrivals, each queue also receives jobs according to
a Poisson process with rate $p_i \lambda$.

Definition 1 (RND-ρ): The RND-ρ dispatching policy balances
the load equally by setting $p_i = \nu_i / \sum_j \nu_j$.

As an example, the mean slowdown in a preemptive LIFO
or PS queue with the RND-ρ policy is

\[ \mathrm{E}[\gamma^*] = \frac{m \sum_i \nu_i}{\sum_i \nu_i - \lambda\,\mathrm{E}[Y]}, \]

where the denominator corresponds to the excess capacity.
Thus, with RND-ρ and LIFO queues, the slowdown criterion
is $m$ times higher when $m$ servers are used instead of a single
fast one, irrespective of the service rates $\nu_i$.

For identical servers, $\nu_1 = \nu_2 = \ldots = \nu_m$, the RND-ρ
dispatching policy reduces to RND-U:

Definition 2 (RND-U): The RND-U dispatching policy assigns
jobs uniformly at random using $p_i = 1/m$.

RND-U is obviously the optimal random policy in the case of
identical servers.

In general, the optimal splitting probabilities depend on the
service time distribution and the scheduling discipline. For the
preemptive LIFO (and PS) server systems the mean slowdown
with a random dispatching policy is given by

\[ \mathrm{E}[\gamma^*] = \Big( \sum_j \nu_j \Big) \sum_{i} \frac{p_i}{\nu_i - p_i \cdot \lambda\,\mathrm{E}[Y]}, \]

where $\sum_j \nu_j$ is a constant that can be neglected when optimizing
the $p_i$.

Definition 3 (RND-opt): The optimal random dispatching
policy for LIFO/PS queues, referred to as RND-opt, splits the
incoming tasks using the probability distribution

\[ p_i = \frac{\nu_i - \sqrt{\nu_i}\,G}{\lambda\,\mathrm{E}[Y]}, \quad \text{where} \quad G = \frac{\sum_i \nu_i - \lambda\,\mathrm{E}[Y]}{\sum_i \sqrt{\nu_i}}. \]

The result is easy to show with the aid of Lagrange multipliers.
We note that when some servers are too slow, the above gives
infeasible values for some $p_i$, in which case one simply
excludes the slowest server from the solution and re-computes
a new probability distribution. This is repeated until a feasible
solution is found. A more explicit formulation is given in [?],
[18]. RND-opt is insensitive to the job size distribution and
optimal only for LIFO and PS.
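The exclusion procedure described above can be sketched as follows. The function names and the convention of passing the offered work rate (the arrival rate times the mean job size) as a single number are assumptions of this example; it is a sketch for LIFO/PS queues, not the formulation of the cited references.

```python
import math


def rnd_rho(rates):
    """RND-rho: split in proportion to the service rates (equal loads)."""
    total = sum(rates)
    return [v / total for v in rates]


def rnd_opt(rates, offered):
    """RND-opt splitting probabilities; `offered` = lambda * E[Y].
    Servers that would get a negative probability are excluded, slowest first."""
    if offered <= 0.0 or offered >= sum(rates):
        raise ValueError("need 0 < offered work < total service rate")
    active = sorted(range(len(rates)), key=lambda i: rates[i], reverse=True)
    while True:
        g = (sum(rates[i] for i in active) - offered) / sum(
            math.sqrt(rates[i]) for i in active)
        probs = {i: (rates[i] - math.sqrt(rates[i]) * g) / offered for i in active}
        if probs[active[-1]] >= 0.0:   # slowest active server is feasible
            break
        active.pop()                   # drop the slowest server and retry
    return [probs.get(i, 0.0) for i in range(len(rates))]


def mean_slowdown_lifo(rates, probs, offered):
    """System-wide mean slowdown for preemptive LIFO/PS queues."""
    total = sum(rates)
    return total * sum(p / (v - p * offered)
                       for v, p in zip(rates, probs) if p > 0.0)


rates = [4.0, 1.0]
print(rnd_opt(rates, 1.0))
print(mean_slowdown_lifo(rates, rnd_opt(rates, 1.0), 1.0),
      mean_slowdown_lifo(rates, rnd_rho(rates), 1.0))
```

In the example the slow server is excluded altogether at this load, and the resulting mean slowdown is lower than with equal-load splitting. Checking only the slowest active server suffices because the probability is negative exactly when the square root of the service rate falls below G.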

The Pollaczek-Khinchin mean value formula enables one to
analyze FIFO queues and, e.g., to write an expression for
the mean slowdown with RND-p. Also the optimal splitting
probabilities can be computed numerically for an arbitrary job
size distribution. According to (1), the mean slowdown in each
queue depends on the mean waiting time $E[W_i]$ and $E[X^{-1}]$,
where the latter is assumed to exist and to be finite. Hence,
the optimal probability distribution with respect to slowdown
is the same as with the mean sojourn time.

Similarly, the mean sojourn time in SPT and SRPT queues is
known, which allows a numerical optimization of the splitting
probabilities. For simplicity, we consider only RND-p and
RND-opt in this paper, as compact closed form expressions
for the $p_i$ are only available for these.

B. SITA-E Dispatching Policy 

With FIFO queues, a state-independent dispatching policy
known as the size-interval-task-assignment (SITA) has proven
to be efficient, especially with heavy-tailed job size
distributions [6], [14], [?]. The motivation behind SITA is to
segregate the long jobs from the short ones. Reality, however, is
more complicated, and segregating the jobs categorically can also
give suboptimal results [16]. Here we assume
$\nu_1 \ge \nu_2 \ge \ldots \ge \nu_m$ and a continuous job size
distribution with pdf $f(x)$.

Definition 4 (SITA): A SITA policy is defined by disjoint
job size intervals $\{(\xi_0, \xi_1], (\xi_1, \xi_2], \ldots, (\xi_{m-1}, \xi_m]\}$,
and assigns a job with size x to server i iff $x \in (\xi_{i-1}, \xi_i]$.

Without loss of generality, one can assume that $\xi_0 = 0$
and $\xi_m = \infty$. Note that in contrast to random policies, SITA
assumes that the dispatcher is aware of the size of the new job.
In this paper, we limit ourselves to SITA-E, where E stands
for equal load. That is, the size intervals are chosen in such
a way that the load is balanced between the servers.



Definition 5 (SITA-E): With the SITA-E dispatching policy the
thresholds $\xi_i$ are defined in such a way that

$$\frac{1}{\nu_i}\int_{\xi_{i-1}}^{\xi_i} x f(x)\,dx = \frac{1}{\nu_j}\int_{\xi_{j-1}}^{\xi_j} x f(x)\,dx, \qquad \forall\, i, j.$$

Thus, similarly to RND-p, the SITA-E policy is also insensitive
to the arrival rate $\lambda$.
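
The equal-load condition of Definition 5 means that server i receives the share of the total work that corresponds to its share of the total service rate, so the thresholds can be found by inverting the partial mean of the job size distribution. The sketch below does this by bisection. The function names, the `upper` search bound and the exponential job size distribution used in the example are assumptions made for illustration only and are not taken from the paper.

```python
import math


def sita_e_thresholds(partial_mean, mean_size, rates, upper, iters=200):
    """Return SITA-E thresholds xi_1, ..., xi_{m-1} (xi_0 = 0, xi_m = inf).

    partial_mean(x) must return the integral of t*f(t) over (0, x],
    mean_size is E[X], rates are the service rates in SITA order and
    upper is a (hypothetical) bound that brackets every threshold.
    """
    total_rate = sum(rates)
    thresholds = []
    cum_rate = 0.0
    for nu in rates[:-1]:
        cum_rate += nu
        # Work assigned to servers 1..i must match their share of capacity.
        target = mean_size * cum_rate / total_rate
        lo, hi = 0.0, upper
        for _ in range(iters):  # bisection on the nondecreasing partial mean
            mid = 0.5 * (lo + hi)
            if partial_mean(mid) < target:
                lo = mid
            else:
                hi = mid
        thresholds.append(0.5 * (lo + hi))
    return thresholds


def exp_partial_mean(x):
    """Partial mean of an Exp(1) job size (illustrative distribution)."""
    return 1.0 - (1.0 + x) * math.exp(-x)


xi = sita_e_thresholds(exp_partial_mean, 1.0, [1.0, 0.5, 0.5], upper=100.0)
```

Note that the arrival rate does not appear anywhere in the computation, in line with the insensitivity remark above.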

C. Improved Dispatching Policies 

The important property of the above state-independent
dispatching policies is that the arrival process to each queue
is a Poisson process. Consequently, the value functions derived
earlier allow us to quantify the value function of the whole
system. With a slight abuse of notation,

$$v_{\mathbf z} = \sum_{i=1}^{m} v_{z_i},$$

where $v_{z_i}$ denotes the relative value of queue i in state $z_i$,
and $\mathbf z = (z_1, \ldots, z_m)$ the state of the whole system.

Policy Improvement by Role Switching: One interesting
opportunity to utilize the relative values is to switch the roles of
two queues [19]. Namely, with an arbitrary state-independent
policy one can switch the input processes of any two identical
servers at any moment, and effectively end up in a new state
of the same system. Despite its limitation (identical servers),
switching is an interesting policy improvement method that
requires little additional computation. The switch should only
be carried out when the new state has lower expected future
costs, i.e., when for some $i \ne j$ the rates are equal,
$\nu_i = \nu_j$, and

$$v_{(\ldots, z_j, \ldots, z_i, \ldots)} < v_{(\ldots, z_i, \ldots, z_j, \ldots)}.$$

Carrying out this operation whenever a new job has arrived
to the system reduces the state space by removing such states
for which a better alternative to continue exists. Formally, let
$\pi(\mathbf z)$ denote the set of feasible permutations of the queues'
roles, where the input processes between identical servers are
switched. Then, the optimal state-space reduction, implicitly
defining a new policy, is given by

$$\mathbf z \leftarrow \arg\min_{\mathbf z' \in \pi(\mathbf z)} v_{\mathbf z'}.$$

To elaborate this, consider a RND-p/LIFO system. In this
case, (7) implies that switching the roles of any two identical
queues makes no difference to the relative value.

In contrast, with the FIFO discipline the interesting quantity
from (5) for two identical servers i and j is identified to be

$$C_{ij} = \frac{\lambda_i E[B_i]}{2(1-\rho_i)}\, u_{z_j}^2 .$$

Switching the input processes between queues i and j leads
to a state with a lower relative value if
$C_{ii} + C_{jj} > C_{ij} + C_{ji}$.
Again, with RND-p the factor $\lambda_i E[B_i]/(1-\rho_i)$ is a constant
for any two identical servers and switching provides no gain.
However, with SITA-E, even though the denominator $1-\rho_i$
is a constant, the numerator is not. Let $X_i$ denote a job size
in queue i. Then $E[X_i] < E[X_{i+1}]$, and therefore
$\lambda_i > \lambda_{i+1}$. Moreover, $E[1/X_i] > E[1/X_{i+1}]$, yielding

$$\lambda_i E[1/X_i] > \lambda_{i+1} E[1/X_{i+1}].$$

Consequently, with the SITA-E with switch policy, the optimal
permutation, after inserting a new job to a queue, is the one
with increasing backlogs for identical servers:

Definition 6 (SITA-Es): The SITA-E with switch dispatching
policy behaves like SITA-E, with the distinction that after
each task assignment the queues are permuted in such a way
that $u_{z_i} \le u_{z_{i+1}}$ for all identical servers i and i + 1.
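
As an illustration of Definition 6, the following sketch reorders the backlogs among servers that share the same service rate. It is a simplification under stated assumptions: the function name `switch_roles` is hypothetical, only the backlog of each FIFO queue is tracked (in a full simulator the queue contents would be exchanged as well), and identical servers are detected by exact equality of their rates.

```python
def switch_roles(rates, backlogs):
    """Permute backlogs so that they are nondecreasing in the server
    index within every group of identical servers (equal rate)."""
    result = list(backlogs)
    groups = {}
    for i, nu in enumerate(rates):
        groups.setdefault(nu, []).append(i)  # indices in increasing order
    for indices in groups.values():
        ordered = sorted(backlogs[j] for j in indices)
        for i, u in zip(indices, ordered):
            result[i] = u
    return result


# One fast server and two identical slow servers: only the last two
# queues may exchange their roles.
after = switch_roles([1.0, 0.5, 0.5], [3.0, 5.0, 2.0])
```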

Policy Improvement by FPI: The first-policy-iteration (FPI)
is a general method of the MDP framework to improve any
given policy. We apply it to the dispatching problem [19].
Suppose that the scheduling discipline in each queue is fixed 
and that a state-independent basic dispatching policy would 
assign a new job to some queue. Given the relative values and 
the expected cost associated with accepting the job to each 
queue, we can carry out FPI: we deviate from the default 
action if the expected cost is smaller with some other action, 
thereby decreasing the expected cumulative costs in infinite 
time horizon. 

Let $\lambda_i$ denote the arrival rate to queue i according to the
basic dispatching policy, and $Y_i$ the corresponding job size.
With a state-independent policy, accepting a job to queue i
does not affect the future behavior of the other queues. Thus,
the cost of assigning a job with size y to queue i is

$$a_{\mathbf z}(y, i) = \omega_{z_i}(y),$$

where $\omega_{z_i}(y)$ is the mean admittance cost of a job with size
y (measured, e.g., in bytes) to queue i, where its service
requirement (measured in time) would be $y/\nu_i$. For slowdown,
we have an elementary relation 



<(y) 



For FIFO queues, (6) gives

$$\omega_{z_i}(y) = \Big(\sum_j \nu_j\Big)\left[\frac{u_{z_i} + y/\nu_i}{y} + \frac{\lambda_i E[Y_i^{-1}]}{2}\cdot\frac{2 u_{z_i} y + y^2/\nu_i}{\nu_i - \lambda_i E[Y_i]}\right].$$

For preemptive LIFO queues, (8) similarly gives

$$\omega_{z_i}(y) = \Big(\sum_j \nu_j\Big)\,\frac{1 + (y/\nu_i)\sum_{j=1}^{n_i} 1/\Delta^*_j}{\nu_i - \lambda_i E[Y_i]},$$

where $n_i$ denotes the number of jobs in queue i and $\Delta^*_j$
the initial service requirement (in time) of job j in queue i.
According to the FPI principle, one simply chooses the queue
with the smallest (expected) cost,

$$\alpha_{\mathbf z}(y) = \arg\min_i\, \omega_{z_i}(y). \qquad (12)$$

We refer to these improved dispatching policies simply as the
FPI-p policy, where p denotes the basic dispatching policy,
e.g., RND-opt or RND-p, where for brevity the RND
prefix is often omitted. Note that the factor $\sum_j \nu_j$ in both
expressions is a common constant for all queues.
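
The FPI decision rule (12) can be sketched as follows for the FIFO and preemptive LIFO admittance costs. The function and parameter names are hypothetical, the per-queue quantities (arrival rate, mean job size and mean inverse job size under the basic policy) are assumed to be given, and the numbers in the example are made up purely for illustration.

```python
def fifo_admittance_cost(y, nu, backlog, lam_i, mean_y, mean_inv_y,
                         total_rate):
    """Admittance cost of a size-y job to a FIFO queue with rate nu.

    backlog is the unfinished work (in time); lam_i, mean_y and
    mean_inv_y describe the jobs the basic policy sends to this queue.
    """
    own = (backlog + y / nu) / y  # slowdown of the new job itself
    later = lam_i * mean_inv_y * (2.0 * backlog * y + y * y / nu) / (
        2.0 * (nu - lam_i * mean_y))  # extra cost to later arrivals
    return total_rate * (own + later)


def lifo_admittance_cost(y, nu, initial_times, lam_i, mean_y, total_rate):
    """Admittance cost to a preemptive LIFO queue; initial_times holds
    the initial service requirements (in time) of the jobs present."""
    delayed = sum(1.0 / d for d in initial_times)
    return total_rate * (1.0 + (y / nu) * delayed) / (nu - lam_i * mean_y)


def fpi_choice(costs):
    """Index of the queue with the smallest admittance cost."""
    return min(range(len(costs)), key=lambda i: costs[i])


# Two identical FIFO servers under RND-U (illustrative numbers).
backlogs = [2.0, 0.5]
costs = [fifo_admittance_cost(1.0, 1.0, u, 0.5, 1.0, 2.0, 2.0)
         for u in backlogs]
chosen = fpi_choice(costs)
```

In this symmetric example the cost is increasing in the backlog, so the rule picks the queue with the least unfinished work.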



Policy        Description

RND-p         state-independent random policy with load balancing
RND-opt       state-independent random policy minimizing the mean slowdown (assumes LIFO/PS)
SITA-E        state-independent size-interval-task-assignment with equal loads (optionally, with switch)

Round-Robin   assigns arriving tasks sequentially to servers [8]
LWL−          least-work-left, assigns a job to the server with the least amount of unfinished work upon arrival
LWL+          same as LWL− but based on the unfinished work (in time) including the new job [19]
JSQ           join-the-shortest-queue, i.e., the one with the least number of jobs [29]
Myopic        minimizes the mean slowdown on the condition that no further jobs arrive
FPI           first policy iteration on the state-independent RND-p, RND-opt and SITA-E policies

TABLE I
Dispatching policies evaluated in the numerical examples.





Fig. 5. Example dispatching systems: (a) two identical servers, (b) three heterogeneous servers.



In a symmetric case of m identical servers and the RND-U
basic policy, the corresponding FPI-based policy reduces to
a well-known dispatching policy: with FIFO queues, FPI
yields LWL, and with LIFO, the Myopic policy (see Table I).
Moreover, if the job size is additionally constant, then in the
case of LIFO queues one ends up with the JSQ policy. In general,
for constant job sizes the slowdown objective reduces to the
minimization of the mean sojourn time.

V. Numerical Examples 
Let us next evaluate by means of numerical simulations
the improved dispatching policies derived in Section IV. To
this end, we have chosen to consider two elementary server
systems that are illustrated in Fig. 5:

(a) Two identical servers with rates $\nu_1 = \nu_2 = 1$,

(b) Three heterogeneous servers with rates $\nu_1 = 1$ and
$\nu_2 = \nu_3 = 1/2$.

Thus, the total service rate in both systems is equal to 2.
We compare the mean slowdown performance of the FPI
policies against several well-known heuristic dispatching
policies, including the state-independent policies RND-opt,
RND-p and SITA-E. The somewhat more sophisticated
least-work-left (LWL) policies choose the queue with the least
amount of unfinished work (backlog). The difference between the
LWL policies is that LWL− considers the situation without the new
job, and LWL+ the situation with the new job included. With
identical servers the LWL policies are equivalent. JSQ chooses
the queue with the least number of jobs. With LWL and JSQ, the
ties are broken in favor of a faster server. The Myopic policy
assumes in a greedy fashion that no further jobs arrive and
chooses the queue which minimizes the immediate cost the known
jobs accrue. All policies are listed in Table I.
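
The heuristic reference policies can be written compactly once the state of each queue is summarized by its backlog and its number of jobs. The sketch below is a simplified reading of the descriptions above: function names are hypothetical, backlogs are assumed to be measured in time, and ties are broken in favor of the faster server (and then the lower index).

```python
def lwl_minus(rates, backlogs):
    """LWL-: least unfinished work (in time), excluding the new job."""
    return min(range(len(rates)), key=lambda i: (backlogs[i], -rates[i]))


def lwl_plus(rates, backlogs, y):
    """LWL+: least unfinished work (in time), including the new job
    of size y, whose service time on server i is y / rates[i]."""
    return min(range(len(rates)),
               key=lambda i: (backlogs[i] + y / rates[i], -rates[i]))


def jsq(rates, counts):
    """JSQ: the queue with the least number of jobs."""
    return min(range(len(rates)), key=lambda i: (counts[i], -rates[i]))


# Three heterogeneous servers, a new job of size 2 arrives.
rates = [1.0, 0.5, 0.5]
backlogs = [1.0, 0.0, 0.0]
choice_minus = lwl_minus(rates, backlogs)     # an idle slow server
choice_plus = lwl_plus(rates, backlogs, 2.0)  # the fast server
```

The example shows how the two LWL variants can disagree with heterogeneous servers, whereas with identical servers they coincide.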



In many real-life settings, the job sizes have been found
to exhibit heavy-tailed behavior (cf., e.g., file sizes in the
Internet). Thus, in most examples we assume a bounded Pareto
distribution with pdf

$$f(x) = \frac{\alpha k^{\alpha} x^{-\alpha-1}}{1 - (k/p)^{\alpha}}, \qquad k \le x \le p, \qquad (13)$$

where $(k, p, \alpha) = (0.33959, 1000, 1.5)$ so that $E[X] \approx 1$. This
job size distribution is particularly suitable for SITA-E.
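
The claim that the chosen parameters give a mean close to one can be checked directly, and job sizes can be drawn by inverting the distribution function. The sketch below assumes the standard bounded Pareto form with lower bound k, upper bound p and shape parameter alpha; the function names are illustrative.

```python
import random

K, P, ALPHA = 0.33959, 1000.0, 1.5


def bp_pdf(x, k=K, p=P, a=ALPHA):
    """Bounded Pareto density on [k, p]."""
    if x < k or x > p:
        return 0.0
    return a * k**a * x**(-a - 1.0) / (1.0 - (k / p)**a)


def bp_mean(k=K, p=P, a=ALPHA):
    """Closed-form mean of the bounded Pareto distribution (a != 1)."""
    norm = a * k**a / (1.0 - (k / p)**a)
    return norm * (k**(1.0 - a) - p**(1.0 - a)) / (a - 1.0)


def bp_sample(rng, k=K, p=P, a=ALPHA):
    """Draw one job size by inverting the distribution function
    F(x) = (1 - (k/x)^a) / (1 - (k/p)^a)."""
    u = rng.random()
    return k * (1.0 - u * (1.0 - (k / p)**a)) ** (-1.0 / a)


rng = random.Random(1)
sizes = [bp_sample(rng) for _ in range(1000)]
```

Because of the heavy tail, the sample mean of such a short run can deviate noticeably from the exact mean returned by `bp_mean`.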

A. FIFO 

First we assume that servers operate under the FIFO
scheduling discipline and job sizes obey the bounded Pareto
distribution (13). With FIFO, the departure time gets fixed
at dispatching and the Myopic policy corresponds to selfish
users choosing the queue that guarantees the shortest sojourn
time, i.e., LWL+. Similarly, LWL− is equivalent to an M/G/m
system with a single shared queue.

Two identical servers: Fig. 6 (left) depicts the results with
two identical servers with rates $\nu_1 = \nu_2 = 1$. The x-axis
corresponds to the offered load $\rho$ and the y-axis to the relative
mean slowdown, $E[\gamma]/E[\gamma_{\text{SITA-E}}]$, i.e., the comparison is
against SITA-E. We find that the policies appear to fall in
one of three groups with respect to slowdown: SITA
policies form the best group, other queue state-dependent
policies the next, and RND-U and Round-Robin have the
worst performance. In particular, FPI-SITA-E outperforms the
other policies by a significant margin, including the other
SITA policies. Note also that when $\rho$ is small, the sensible
state-dependent policies utilize idle servers better than, e.g.,
SITA-E.

Fig. 7 illustrates SITA-E and FPI-SITA-E in the same setting
at an offered load of $\rho = 0.5$. The x- and y-axes correspond
to the backlog in Queue 1 and Queue 2, respectively, and
the z-axis to the maximum job size a given policy assigns
to Queue 1. With SITA-E, this threshold is constant, while
FPI-SITA-E changes the threshold dynamically. One observes
that as a result of FPI, Queue 1 has become a "high-priority"
queue where no jobs are accepted whenever Queue 1 has more
unfinished work than Queue 2. Similarly, when Queue 2 has
a long backlog, the threshold for assigning a job to Queue 1
becomes higher with FPI than with SITA-E.



Fig. 6. Numerical results with two identical servers (left) and three
heterogeneous servers having rates $\nu = (1, 0.5, 0.5)$ (right). The
scheduling discipline is FIFO and job sizes obey the bounded Pareto
distribution.

Fig. 7. SITA-E and FPI-SITA-E policies in the setting of two identical FIFO
queues, bounded Pareto distributed job sizes and offered load $\rho = 0.5$. The
x- and y-axes correspond to the backlog in Queue 1 and Queue 2, respectively,
and the z-axis to the maximum job size that a policy assigns to Queue 1.

Fig. 9. Comparison of the chosen server constellations with an equal total
capacity: i) single server, ii) two identical servers, and iii) three servers.
The scheduling discipline is LIFO and job sizes obey the bounded Pareto
distribution.



Three heterogeneous servers: Consider next the asymmetric
setting comprising one primary server with service rate $\nu_1 = 1$
and two secondary servers with rates $\nu_2 = \nu_3 = 1/2$. Fig. 6
(right) illustrates the results from a numerical simulation.

The state-independent random policies are much worse than
the other policies. However, state-independent SITA-E again
proves to be better than the state-dependent policies JSQ,
LWL−, LWL+, Myopic, FPI-p and FPI-opt. The gain from
switching the roles of the queues (SITA-E vs. SITA-Es) is
smaller in this case due to the fact that we may only switch
the roles of the two slower queues having the same service
rate $\nu = 0.5$. As expected, the FPI-SITA-E policy yields
a significantly lower mean slowdown than any other policy,
especially under a heavy load.

Based on the results, one can assume that the relative values
obtained for the random policies simply do not capture the
situation sufficiently well, while SITA-E is already a good
dispatching policy, and FPI then leads to "adaptive" size
intervals.

B. Preemptive LIFO 

Next we assume that the servers are bound to operate
under LIFO, which, as mentioned, has a reasonably robust
performance with respect to the mean slowdown. Job sizes
are again assumed to obey the bounded Pareto distribution.

Two identical servers: Fig. 8 (left) illustrates the performance
with respect to the slowdown criterion for two identical
servers. The x-axis is again the offered load $\rho$ and the y-axis
corresponds to the relative mean slowdown (here the comparison
is against the Myopic policy). One can identify three
performance groups: RND and LWL form the worst performing
group, then come Round-Robin and JSQ, and the Myopic and
FPI-RND policies achieve the lowest mean slowdown.

Three heterogeneous servers: Fig. 8 (right) illustrates the
simulation results in the asymmetric setting comprising one
primary server with service rate $\nu_1 = 1$ and two secondary
servers with rates $\nu_2 = \nu_3 = 1/2$. In this case, the Myopic
approach is the best among the candidates, while both FPI
policies attain an almost identical mean slowdown. Even though
the value function for the Myopic policy is not available to
us, one can estimate the relative values by means of simulation
and carry out the policy improvement numerically. However,
in this paper we do not pursue this direction.

Server constellations: In Fig. 9 we compare the mean
slowdown between different server constellations assuming the
preemptive LIFO scheduling discipline. The y-axis represents
the relative performance when compared to the two server
system. Not surprisingly, when the load is small or moderate,
a single server system achieves the lowest mean slowdown.



Fig. 8. Numerical results with the preemptive LIFO scheduling discipline and
bounded Pareto distributed job sizes for two identical servers with rates
$\nu_1 = \nu_2 = 1$ (left), and three heterogeneous servers with rates
$\nu_1 = 1$ and $\nu_2 = \nu_3 = 1/2$ (right). The x-axis is the offered load $\rho$.



Fig. 10. Numerical results with SPTP and bounded Pareto distributed job sizes
for two identical (left) and three heterogeneous servers (right).



However, as the load increases more, first the two server 
system becomes optimal, and then eventually the three server 
system takes the lead. This is due to the fact that at higher 
load levels, the correct dispatching decisions allow one to serve 
more expensive jobs faster. One interesting research question 
indeed is that given a total capacity budget, what is the optimal 
server constellation so as to minimize the mean slowdown. 

C. SPTP 

Finally, let us consider the SPTP scheduling discipline that
is the natural choice when minimizing the mean slowdown
(see Section III). The job sizes are again assumed to obey
the bounded Pareto distribution. Fig. 10 (left) illustrates the
slowdown performance with the two identical servers.
Interestingly, LWL becomes even weaker than RND-U as the offered
load increases. Myopic and FPI-RND achieve the lowest mean
slowdown.

In the case of three heterogeneous servers, the situation is
again more challenging and the numerical results are depicted
in Fig. 10 (right). At low levels of load, RND-opt is a
surprisingly good choice, suggesting that SPTP manages to
locally "correct" the occasional suboptimal decisions. When
the load increases, the LWL policies become weak again. The
Myopic policy shows a robust and good performance at all
levels of offered load. However, the FPI policies, and FPI-opt
in particular, achieve the lowest mean slowdown, which is
significantly better than with any other policy when $\rho > 0.5$.

D. Comparison of scheduling disciplines

So far we have fixed a scheduling discipline and evaluated
different dispatching policies. Here we give a brief comparison
of different scheduling disciplines with respect to the
slowdown metric. Job sizes obey either (a) the uniform distribution,
Y ~ U(0.5, 1.5), or (b) the bounded Pareto distribution, both
having the same mean, E[Y] = 1.

First we consider a single server queue. Fig. 11 (left) depicts
a comparison against LIFO with the same job size distribution
on the logarithmic scale. In accordance with (?), FIFO is better
than LIFO with the uniform job size distribution, and with the
bounded Pareto distribution the situation is the opposite. SPTP
is clearly the best in both cases. In Fig. 11 (right), the reference
level is the mean slowdown with SPTP. Here we have included
also SRPT, which turns out to be only marginally worse than
SPTP, as observed also in [28]. LIFO and PS are significantly
worse.

Fig. 11. Comparison of different scheduling disciplines in a single server
queue. FIFO is better than LIFO with some job size distributions (left). SPTP
is only marginally better than SRPT (right).

Fig. 12 illustrates the relative performance between different
scheduling disciplines and dispatching policies in the
heterogeneous three server dispatching system. SRPT is not included
as its performance is very similar to SPTP. The reference
level is the mean slowdown that JSQ/FIFO (left) and JSQ/LIFO
(right) achieve. FPI manages to outperform the corresponding
JSQ policy in all cases. Especially with the bounded Pareto
distributed job sizes, FPI/LIFO performs rather well when
compared to JSQ/LIFO. SPTP is clearly the superior scheduling
discipline in both cases, as expected.

Fig. 12. Heterogeneous dispatching system with different scheduling
disciplines.

In general, the scheduling discipline seems to have a
stronger influence on the performance than the dispatching
policy.

VI. Conclusions 

This work generalizes the earlier results on the size-aware
value functions with respect to the sojourn time for M/G/1
queues with FIFO, LIFO, SPT and SRPT disciplines to a
general job specific holding cost. Additionally, we have considered
the SPTP queueing discipline, the optimality of which with
Poisson arrivals was also established. As an important special
case, one obtains the slowdown criterion, where the holding
cost is inversely proportional to the job's (original) size. These
results were then utilized in developing robust policies for
dispatching systems. In particular, the value functions enable
the policy improvement step for an arbitrary state-independent
dispatching policy such as Bernoulli splitting. The derived
dispatching policies were also shown to perform well by
means of numerical simulations. In particular, the highest
improvements were often obtained in the more challenging
heterogeneous setting with unequal service rates. Also the
SPTP scheduling discipline outperformed the others by a clear
margin in the examples, except for the SRPT discipline, which
appears to offer rather similar performance in terms of
slowdown and sojourn time.

Appendix

Carrying out similar steps as with SPTP, it is straightforward
to derive the value function also in the case of the SPT and
SRPT policies. Again, we let $f(x)$ denote the pdf of the job
size distribution and $\rho(x)$ the offered load due to jobs shorter
than x according to (9). The backlog due to higher priority
work in this case is $u_z(x)$, where the priority index for
SPT is the initial service time $\Delta^*_i$, and for SRPT the remaining
service time $\Delta_i$.

A. M/G/1-SPT 

We consider the non-preemptive SPT, which is the optimal
non-preemptive schedule with respect to both the mean
sojourn time and the mean slowdown. Thus, the remaining
service time of a queueing job is equal to the initial service
time. Therefore, a sufficient state description for a
non-preemptive M/G/1-SPT queue with arbitrary holding costs
is $z = ((\Delta_1, b_1); \ldots; (\Delta_n, b_n))$, where job 1 (if any) is
currently receiving service and jobs 2, ..., n are waiting in
the queue. Without loss of generality, we can assume that
$\Delta_2 \le \Delta_3 \le \ldots \le \Delta_n$.

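Proposition 4: For the size-aware value function with respect
to arbitrary holding costs in a non-preemptive M/G/1-SPT
queue it holds that

$$v_z - v_0 = \sum_{i=1}^{n} b_i\left(\Delta_i + \frac{\sum_{j=1}^{i-1}\Delta_j}{1-\rho(\Delta_i)}\right) + \frac{\lambda}{2}\sum_{i=1}^{n}\left[\Big(\sum_{j=1}^{i}\Delta_j\Big)^2 + \sum_{j=i+1}^{n}\Delta_j^2\right]\int_{\tilde\Delta_i}^{\tilde\Delta_{i+1}} \frac{b(x)\, f(x)}{(1-\rho(x))^2}\,dx, \qquad (14)$$

where $b(x) = E[B \mid X = x]$ and the integration intervals are

$$\tilde\Delta_i = \begin{cases} 0, & i = 1,\\ \Delta_i, & i = 2,\ldots,n,\\ \infty, & i = n+1. \end{cases}$$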

Proof: Similarly as with SPTP, we compare two systems:
System 1 initially in state z and System 2 initially empty.
The two systems behave identically once System 1 becomes
empty for the first time. We can write $v_z - v_0 = h_1 + h_2$, where
$h_1$ is the cost the n jobs initially present (only) in System 1
incur, and $h_2$ the difference in cost the later arriving customers
incur between the two systems.

First, the mean remaining sojourn time of job i in a
non-preemptive M/G/1-SPT queue is

$$E[R_i] = \Delta_i + \frac{\sum_{j=1}^{i-1}\Delta_j}{1-\rho(\Delta_i)}, \qquad (15)$$

which holds for all n jobs present only in System 1. Therefore,
their contribution to $v_z - v_0$ is

$$h_1 = \sum_{i=1}^{n} b_i\left(\Delta_i + \frac{\sum_{j=1}^{i-1}\Delta_j}{1-\rho(\Delta_i)}\right).$$

For the later arriving jobs, we condition the derivation on
job sizes (x, x+dx), which arrive at the rate $\lambda f(x)\,dx$. Let $t(x)$
denote the mean additional sojourn time such jobs experience
in System 1 when compared to System 2. Let $b(x) = E[B \mid X = x]$
so that we have an expression for $h_2$,

$$h_2 = \int_0^\infty t(x)\, b(x)\, dx.$$

Instead of considering actual arrivals, we focus on the rate at
which virtual costs are accrued in order to find $t(x)$. With the aid
of (15), one can deduce that the virtual cost rate is equal to

$$\frac{\lambda f(x)}{1-\rho(x)}\, U_z(x,t),$$

where $U_z(x,t)$ denotes the amount of work at time t that
would be processed before an arriving size x job when the
system was initially in state z. In particular, integrating the
difference in the virtual cost rates of the two systems over time
gives

$$t(x) = \frac{\lambda f(x)}{1-\rho(x)}\, E\left[\int_0^\infty \big(U_z(x,t) - U_0(x,t)\big)\,dt\right].$$

Fig. 13. Derivation of value function for an M/G/1-SPT queue.

Let $k_z(x)$ denote the number of jobs waiting in the queue
with a remaining service time greater than x,

$$k_z(x) = |\{\, i \in \{2,\ldots,n\} : \Delta_i > x \,\}|.$$

Due to the non-preemptive discipline, the evolution of $U_z(x,t)$
for a non-empty state z consists of $k_z(x) + 1$ phases. Let
$m = n - k_z(x)$. During the first phase jobs 1, ..., m are served
and the initial backlog size x jobs see is $\Delta_1 + \ldots + \Delta_m$. The
second phase starts when job m + 1 enters the server and cannot
be preempted, thus creating an initial backlog $\Delta_{m+1}$ for size
x jobs, etc.

Considering arrival sample paths, one can deduce that
the difference $\int (U_z(x,t) - U_0(x,t))\,dt$ corresponds to the
white marked region in Fig. 13, which consists of $k_z(x) + 1$
statistically independent areas. The mean area of a phase starting
with an initial backlog of u from size x customers' point of
view is given by

$$\frac{u^2}{2(1-\rho(x))}. \qquad (16)$$

Summing over the phases gives

$$t(x) = \frac{\lambda f(x)}{2(1-\rho(x))^2}\left[\Big(\sum_{j=1}^{m}\Delta_j\Big)^2 + \sum_{j=m+1}^{n}\Delta_j^2\right], \qquad m = n - k_z(x).$$

For example, when $x > \Delta_n$, all current n jobs in System
1 have a higher priority, constituting a single phase with
$u = \Delta_1 + \ldots + \Delta_n$, etc. Next define the integration intervals,
$\tilde\Delta_1 = 0$, $\tilde\Delta_i = \Delta_i$ for i = 2, ..., n, and
$\tilde\Delta_{n+1} = \infty$. Substituting these into the above gives

$$h_2 = \frac{\lambda}{2}\sum_{i=1}^{n}\left[\Big(\sum_{j=1}^{i}\Delta_j\Big)^2 + \sum_{j=i+1}^{n}\Delta_j^2\right]\int_{\tilde\Delta_i}^{\tilde\Delta_{i+1}} \frac{b(x)\, f(x)}{(1-\rho(x))^2}\,dx,$$

and adding $h_1$ yields (14).

B. M/G/1-SRPT 

While SPT was optimal in the class of non-preemptive
schedules, the SRPT discipline is the optimal preemptive
schedule [25]. For SRPT, a sufficient state description is
$z = ((\Delta_1, b_1); \ldots; (\Delta_n, b_n))$, where, without loss of
generality, we again can assume that job 1 is currently receiving
service (if any) and $\Delta_1 \le \Delta_2 \le \ldots \le \Delta_n$. For the total
higher priority workload we have $u_z(\Delta_i) = \sum_{j=1}^{i} \Delta_j$.

Proposition 5: For the size-aware value function in an
M/G/1-SRPT queue with arbitrary holding cost it holds that

$$v_z - v_0 = \ldots \qquad (17)$$

Proof: Again, we consider System 1 initially in state z
and System 2 initially empty, and find expressions for $h_1$ and
$h_2$ corresponding to the cost the n jobs initially present (only)
in System 1 incur, and the difference in cost the later arriving
customers incur between the two systems.

In an M/G/1-SRPT queue at state $z = (\Delta_1, \ldots, \Delta_n)$, the
mean remaining sojourn time $E[R_z(\Delta)]$ of a job with a
(remaining) size $\Delta$ and an unfinished work $u_z(\Delta)$ ahead in the
queue is known [?], which gives the expected cost the current n
jobs present in System 1 incur,

$$h_1 = \sum_{i=1}^{n} b_i\, E[R_z(\Delta_i)].$$

For the later arriving jobs, we can again write

$$t(x) = \frac{\lambda f(x)}{1-\rho(x)}\, E\left[\int_0^\infty \big(U_z(x,t) - U_0(x,t)\big)\,dt\right],$$

which describes the expected difference in the cumulative
sojourn times between System 1 and System 2 for size x jobs.
With SRPT also the jobs in System 1 initially longer than
x affect the result as eventually, one at a time, they decrease
below the threshold x and trigger a new higher priority
workload present only in System 1 (see Fig. 14). Consequently,
we have one phase starting with a backlog of $u_z(x)$, and
then $n_z(x)$ phases starting with a backlog of x, where $n_z(x)$
denotes the number of jobs longer than x in state z. In some
sense, this is the same as with SPT, but the difference is the
initial backlog size x jobs see: with SPT it is $\Delta_i$ instead of x
for later phases. Hence, each of these phases reduces to (16),
which gives

$$t(x) = \frac{\lambda f(x)\left[u_z(x)^2 + n_z(x)\, x^2\right]}{2(1-\rho(x))^2}.$$

As $h_2 = \int b(x)\, t(x)\,dx$, and both $u_z(x)$ and $n_z(x)$ are
(monotonic) step functions having jumps at $x = \Delta_1, \ldots, \Delta_n$,
we proceed by integrating in parts. Recalling that
$v_z - v_0 = h_1 + h_2$ then gives (17), which completes the proof.

Fig. 14. Derivation of value function for an M/G/1-SRPT queue.

4 Note that we assume the opposite order to that of [19].



